Automatic Construction of Movie Domain Korean Sentiment Dictionary Using Online Movie Reviews

Authors

  • Heeryon Cho
  • Sang-Hyun Choi
Abstract

We present a method for automatically constructing a domain-specific Korean sentiment dictionary that can be used to classify the sentiment of online movie reviews. More than 1.18 million online movie reviews with ratings from 1 to 4 or from 7 to 10 were collected across fourteen movie genres to calculate, for each genre, the joint probability of a given word and the sentiment of the movie reviews. Specifically, two joint probabilities were calculated for each genre: (1) that of a given word and the positive movie reviews (ratings 7 to 10), and (2) that of a given word and the negative movie reviews (ratings 1 to 4). The difference between the two joint probabilities (i.e., (1) – (2)) was obtained for each word in each genre, and each word's differences were averaged over the fourteen genres. Finally, the averaged joint probability differences were normalized to the range -1 to 1. These normalized values serve as the sentiment values of the words in the final 135,082-word movie domain Korean sentiment dictionary. The positive/negative binary sentiment classification performance of the constructed dictionary was evaluated on test data, achieving a balanced accuracy of 80.7% and confirming the effectiveness of the proposed construction method.
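The pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several details are assumptions the abstract does not specify: the joint probability is estimated as the share of a genre's reviews that are positive (or negative) and contain the word, normalization divides by the largest absolute averaged difference, and a review is classified by the sign of the summed word values. The function names `joint_prob_diff`, `build_dictionary`, and `classify`, and the toy reviews, are hypothetical.

```python
from collections import Counter


def joint_prob_diff(reviews):
    """For one genre, return {word: P(word, positive) - P(word, negative)}.

    `reviews` is a list of (tokens, rating) pairs. Assumption: each joint
    probability is estimated as (reviews of that polarity containing the
    word) / (all reviews kept for the genre).
    """
    pos, neg = Counter(), Counter()
    total = 0
    for tokens, rating in reviews:
        if 7 <= rating <= 10:      # positive reviews, as in the abstract
            pos.update(set(tokens))
        elif 1 <= rating <= 4:     # negative reviews, as in the abstract
            neg.update(set(tokens))
        else:
            continue               # ratings 5 and 6 are not used
        total += 1
    if total == 0:
        return {}
    return {w: (pos[w] - neg[w]) / total for w in pos.keys() | neg.keys()}


def build_dictionary(reviews_by_genre):
    """Average the per-genre differences and normalize them to [-1, 1]."""
    diffs = [joint_prob_diff(r) for r in reviews_by_genre.values()]
    vocab = set().union(*diffs)
    # A word missing from a genre contributes 0 to the average (assumption).
    averaged = {w: sum(d.get(w, 0.0) for d in diffs) / len(diffs) for w in vocab}
    # Assumed normalization: divide by the largest absolute value, which
    # keeps the sign of every word's score.
    scale = max((abs(v) for v in averaged.values()), default=0.0)
    if scale == 0:
        return {w: 0.0 for w in averaged}
    return {w: v / scale for w, v in averaged.items()}


def classify(tokens, dictionary):
    """Assumed decision rule: sign of the summed sentiment values."""
    score = sum(dictionary.get(t, 0.0) for t in tokens)
    return "positive" if score > 0 else "negative"


# Toy data with two genres; real input would be tokenized Korean reviews.
reviews_by_genre = {
    "drama": [(["재미", "감동"], 9), (["지루", "재미"], 2)],
    "action": [(["재미"], 10), (["지루"], 1)],
}
sentiment_dict = build_dictionary(reviews_by_genre)
label = classify(["재미", "감동"], sentiment_dict)
```

In this toy example the word that appears only in negative reviews receives the minimum value of -1, while a word seen in both polarities within one genre is pulled toward zero, which is the behavior the averaged difference is meant to capture.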

Similar resources

A Supervised Method for Constructing Sentiment Lexicon in Persian Language

Due to the growth of digital content on the internet and social media, sentiment analysis is an emerging field. The problem, which deals with information extraction and knowledge discovery from textual data using natural language processing, has attracted the attention of many researchers. Construction of a sentiment lexicon as a valuable language resource is one of the imp...

YouTube Movie Reviews: In, Cross, and Open-domain Sentiment Analysis in an Audiovisual Context

In this contribution we focus on the task of automatically analyzing a speaker’s sentiment in on-line videos containing movie reviews. In addition to textual information, we consider adding audio features as typically used in speech-based emotion recognition as well as video features encoding valuable valence information conveyed by the speaker. We combine this multi-modal experimental setup wi...

Machine Learning-based Sentiment Analysis of Automatic Indonesian Translations of English Movie Reviews

Sentiment analysis is the automatic classification of the overall opinion conveyed by a text towards its subject matter. This paper discusses an experiment in the sentiment analysis of a collection of movie reviews that have been automatically translated to Indonesian. Following [1], we employ three well-known classification techniques: naive Bayes, maximum entropy, and support vector machin...

Sentiment Classification and Feature based Summarization of Movie Reviews in Mobile Environment

A new framework is designed for a sentiment classification and feature-based summarization system in a mobile environment. Posting online reviews has become an increasingly popular way for people to share their opinions about a specific product or service with other users. It has become common practice for web technologies to provide the venues and facilities for people to publish their reviews. ...

Sentiment Analysis of movie reviews using SentiWordNet Approach

In this paper, a new kind of domain-specific, feature-based heuristic for aspect-level sentiment analysis of movie reviews is presented. An unsupervised learning technique is used for sentiment classification. A SentiWordNet-based scheme is applied with two different linguistic feature selections, comprising adjectives, adverbs and verbs, together with n-gram feature extraction. In aspect orient...

Publication date: 2015